17 research outputs found

    CAFTAN: a tool for fast mapping, and quality assessment of cDNAs

    Background: The German cDNA Consortium has been cloning full length cDNAs and continued with their exploitation in protein localization experiments and cellular assays. However, the efficient use of large cDNA resources requires the development of strategies that are capable of a speedy selection of truly useful cDNAs from biological and experimental noise. To this end we have developed a new high-throughput analysis tool, CAFTAN, which simplifies these efforts and thus fills the gap between large-scale cDNA collections and their systematic annotation and application in functional genomics. Results: CAFTAN is built around the mapping of cDNAs to the genome assembly, and the subsequent analysis of their genomic context. It uses sequence features like the presence and type of PolyA signals, inner and flanking repeats, the GC-content, splice site types, etc. All these features are evaluated in individual tests and classify cDNAs according to their sequence quality and likelihood to have been generated from fully processed mRNAs. Additionally, CAFTAN compares the coordinates of mapped cDNAs with the genomic coordinates of reference sets from publicly available resources (e.g., VEGA, ENSEMBL). This provides detailed information about overlapping exons and the structural classification of cDNAs with respect to the reference set of splice variants. The evaluation of CAFTAN showed that it is able to correctly classify more than 85% of 5950 selected "known protein-coding" VEGA cDNAs as high quality multi- or single-exon. It identified as good 80.6% of the single exon cDNAs and 85% of the multiple exon cDNAs. The program is written in Perl and in a modular way, allowing the adoption of this strategy to other tasks like EST-annotation, or to extend it by adding new classification rules and new organism databases as they become available. We think that it is a very useful program for the annotation and research of unfinished genomes.
Conclusion: CAFTAN is a high-throughput sequence analysis tool, which performs a fast and reliable quality prediction of cDNAs. Several thousand cDNAs can be analyzed in a short time, giving the curator/scientist a first quick overview of the quality and the already existing annotation of a set of cDNAs. It supports the rejection of low quality cDNAs and helps in the selection of likely novel splice variants, and/or completely novel transcripts for new experiments.
Funding: German Federal Ministry of Education and Research 01GR0101 and 01GR0420 and 01GR045
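
Two of the sequence features named in the CAFTAN abstract, GC-content and the presence and type of PolyA signals, are simple enough to sketch. The following is an illustrative Python sketch only, not CAFTAN's own code (which the abstract says is written in Perl). The function names, the 50-base search window, and the restriction to the canonical AATAAA signal and its common ATTAAA variant are assumptions made for this example.

```python
# Illustrative sketch only: function names, window size, and signal list
# are assumptions for this example, not taken from CAFTAN.
POLYA_SIGNALS = ("AATAAA", "ATTAAA")  # canonical signal and a common variant


def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def polya_signal(seq, window=50):
    """Return the PolyA signal found closest to the 3' end, or None.

    Only the last `window` bases are searched (hypothetical default).
    """
    tail = seq.upper()[-window:]
    hits = [(tail.rfind(signal), signal) for signal in POLYA_SIGNALS if signal in tail]
    # max() picks the hit with the largest start index, i.e. nearest the 3' end.
    return max(hits)[1] if hits else None


if __name__ == "__main__":
    cdna = "GGGG" + "AATAAA" + "C" * 20
    print(gc_content(cdna), polya_signal(cdna))
```

In a pipeline of the kind the abstract describes, each such feature would presumably feed its own pass/fail test; how CAFTAN actually thresholds or combines them is not stated in the abstract.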

    Rhomboid Protease Dynamics and Lipid Interactions

    Intramembrane proteases, which cleave transmembrane (TM) helices, participate in numerous biological processes encompassing all branches of life. Several crystallographic structures of Escherichia coli GlpG rhomboid protease have been determined. In order to understand GlpG dynamics and lipid interactions in a native-like environment, we have examined the molecular dynamics of wild-type and mutant GlpG in different membrane environments. The irregular shape and small hydrophobic thickness of the protein cause significant bilayer deformations that may be important for substrate entry into the active site. Hydrogen-bond interactions with lipids are paramount in protein orientation and dynamics. Mutations in the unusual L1 loop cause changes in protein dynamics and protein orientation that are relayed to the His-Ser catalytic dyad. Similarly, mutations in TM5 change the dynamics and structure of the L1 loop. These results imply that the L1 loop has an important regulatory role in proteolysis.
    Funding: National Institute of General Medical Sciences (GM-74637

    Uncovering the complex genetic architecture of human plasma lipidome using machine learning methods

    The genetic architecture of the plasma lipidome provides insights into the regulation of lipid metabolism and related diseases. We applied an unsupervised machine learning method, PGMRA, to discover many-to-many relations between genotype and plasma lipidome (phenotype) in order to identify the genetic architecture of the plasma lipidome profiled from 1,426 Finnish individuals aged 30–45 years. PGMRA involves biclustering genotype and lipidome data independently, followed by their inter-domain integration based on hypergeometric tests of the number of shared individuals. Pathway enrichment analysis was performed on the SNP sets to identify their associated biological processes. We identified 93 statistically significant (hypergeometric p-value < 0.01) lipidome-genotype relations. Genotype biclusters in these 93 relations contained 5977 SNPs across 3164 genes. Twenty-nine of the 93 relations contained genotype biclusters with more than 50% unique SNPs and participants, thus representing the most distinct subgroups. We identified 30 significantly enriched biological processes among the SNPs involved in 21 of these 29 most distinct genotype-lipidome subgroups through which the identified genetic variants can influence and regulate plasma lipid related metabolism and profiles.
This study identified 29 distinct genotype-lipidome subgroups in the studied Finnish population that may have distinct disease trajectories and therefore could be useful in precision medicine research.
Funding: Research Council of Finland; Social Insurance Institution of Finland; Competitive State Research Financing of Expert Responsibility area of Kuopio, Tampere and Turku University Hospitals; Juho Vainio Foundation; Paavo Nurmi Foundation; Finnish Foundation for Cardiovascular Research; Finnish Cultural Foundation; Finnish IT center for science; Sigrid Juselius Foundation; Tampere Tuberculosis Foundation; Emil Aaltonen Foundation; Yrjo Jahnsson Foundation; Signe and Ane Gyllenberg Foundation; Diabetes Research Foundation of Finnish Diabetes Association 322098 286284 134309 126925 121584 124282 255381 256474 283115 319060 320297 314389 338395 330809 104821 129378 117797 141071 INFRAIA-2016-1-730897; Horizon 2020; European Research Council (ERC) European Commission 349708; Tampere University Hospital Supporting Foundation; Finnish Society of Clinical Chemistry; Spanish Government RTI2018-098983-B-100; Laboratoriolaaketieteen Edistamissaatio~Sr; Ida Montinin saatio; Kalle Kaiharin saatio; Aarne Koskelon saatio; Faculty of Medicine and Health Technology, Tampere University; Project HPC-EUROPA3 X51001 50191928; EC Research Innovation Action under H2020 Programme 75532
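
The integration step in the lipidome abstract, a hypergeometric test on the number of individuals shared by a genotype bicluster and a lipidome bicluster, can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not PGMRA's implementation: the function name `overlap_pvalue`, the one-sided upper-tail form of the test, and the toy inputs are all hypothetical choices for this example.

```python
# Minimal sketch, not PGMRA's code: the function name, the upper-tail form,
# and the toy data below are assumptions for illustration.
from math import comb


def overlap_pvalue(n_total, group_a, group_b):
    """Upper-tail hypergeometric p-value for the overlap of two groups.

    n_total: number of individuals in the cohort.
    group_a, group_b: iterables of individual identifiers, e.g. the members
    of a genotype bicluster and of a lipidome bicluster.
    Returns P(X >= k), where k is the observed number of shared individuals
    and X is the overlap expected if group_b were drawn at random.
    """
    a, b = set(group_a), set(group_b)
    k = len(a & b)
    upper = min(len(a), len(b))
    tail = sum(
        comb(len(a), i) * comb(n_total - len(a), len(b) - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(n_total, len(b))


if __name__ == "__main__":
    # Toy cohort of 10 individuals; both biclusters contain the same 5 people.
    p = overlap_pvalue(10, range(5), range(5))
    print(p, p < 0.01)
```

A relation between two biclusters would then be retained when this p-value falls below the chosen threshold (0.01 in the abstract); whether the original method applies any further correction is not stated there.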

    Optimization of multi-classifiers for computational biology: application to gene finding and expression

    Genomes of many organisms have been sequenced over the last few years. However, transforming such raw sequence data into knowledge remains a hard task. A great number of prediction programs have been developed to address part of this problem: the location of genes along a genome and their expression. We propose a multi-objective methodology to combine state-of-the-art algorithms into an aggregation scheme in order to obtain optimal aggregations of methods. The results obtained show a major improvement in sensitivity when our methodology is compared to the performance of individual methods for gene finding and gene expression problems. The methodology proposed here is an automatic method generator, and a step forward in exploiting all existing methods, by providing alternative optimal aggregations of methods to answer concrete queries for a certain biological problem with maximized prediction accuracy. As more approaches are integrated for each of the presented problems, de novo accuracy can be expected to improve further.
    Funding: Ministry of Science and Innovation, Spain (MICINN) Spanish Government TIN-2006-12879; Junta de Andalucia TIC-02788; Howard Hughes Medical Institute; European Commission; Junta de Andaluci

    Identification of differentially expressed small non-coding RNAs in the legume endosymbiont Sinorhizobium meliloti by comparative genomics

    Bacterial small non-coding RNAs (sRNAs) are being recognized as novel widespread regulators of gene expression in response to environmental signals. Here, we present the first search for sRNA-encoding genes in the nitrogen-fixing endosymbiont Sinorhizobium meliloti, performed by a genome-wide computational analysis of its intergenic regions. Comparative sequence data from eight related alpha-proteobacteria were obtained, and the interspecies pairwise alignments were scored with the programs eQRNA and RNAz as complementary predictive tools to identify conserved and stable secondary structures corresponding to putative non-coding RNAs. Northern experiments confirmed that eight of the predicted loci, selected among the original 32 candidates as most probable sRNA genes, expressed small transcripts. This result supports the combined use of eQRNA and RNAz as a robust strategy to identify novel sRNAs in bacteria. Furthermore, seven of the transcripts accumulated differentially in free-living and symbiotic conditions. Experimental mapping of the 5'-ends of the detected transcripts revealed that their encoding genes are organized in autonomous transcription units with recognizable promoter and, in most cases, termination signatures. These findings suggest novel regulatory functions for sRNAs related to the interactions of alpha-proteobacteria with their eukaryotic hosts.
    Funding: Spanish Ministerio de Educación y Ciencia (Project AGL2006-12466/AGR); Junta de Andalucía (Project CV1-01522); NIH Grant 1R01GM070538-02; FPI Fellowship from the Spanish Ministerio de Educación y Cienci

    Temperament & Character account for brain functional connectivity at rest: A diathesis-stress model of functional dysregulation in psychosis

    The online version contains supplementary material available at https://doi.org/10.1038/s41380-023-02039-6. The human brain’s resting-state functional connectivity (rsFC) provides stable trait-like measures of differences in the perceptual, cognitive, emotional, and social functioning of individuals. The rsFC of the prefrontal cortex is hypothesized to mediate a person’s rational self-government, as is also measured by personality, so we tested whether its connectivity networks account for vulnerability to psychosis and related personality configurations. Young adults were recruited as outpatients or controls from the same communities around psychiatric clinics. Healthy controls (n = 30) and clinically stable outpatients with bipolar disorder (n = 35) or schizophrenia (n = 27) were diagnosed by structured interviews, and then were assessed with standardized protocols of the Human Connectome Project. Data-driven clustering identified five groups of patients with distinct patterns of rsFC regardless of diagnosis. These groups were distinguished by rsFC networks that regulate specific biopsychosocial aspects of psychosis: sensory hypersensitivity, negative emotional balance, impaired attentional control, avolition, and social mistrust. The rsFC group differences were validated by independent measures of white matter microstructure, personality, and clinical features not used to identify the subjects. We confirmed that each connectivity group was organized by differential collaborative interactions among six prefrontal and eight other automatically-coactivated networks. The temperament and character traits of the members of these groups strongly accounted for the differences in rsFC between groups, indicating that configurations of rsFC are internal representations of personality organization. These representations involve weakly self-regulated emotional drives of fear, irrational desire, and mistrust, which predispose to psychopathology.
However, stable outpatients with different diagnoses (bipolar or schizophrenic psychoses) were highly similar in rsFC and personality. This supports a diathesis-stress model in which different complex adaptive systems regulate predisposition (which is similar in stable outpatients despite diagnosis) and stress-induced clinical dysfunction (which differs by diagnosis).
Funding: EU FEDER grants through the Spanish Ministry of Science and Technology PID2021-125017OB-I00, RTI2018-098983-B-I00, D43 TW011793-06A1; United States Department of Health & Human Services National Institutes of Health (NIH) - USA R01-MH124060; Psychosis-Risk Outcomes Network U01 MH12463

    Gene network downstream plant stress response modulated by peroxisomal H2O2

    Reactive oxygen species (ROS) act as secondary messengers that can be sensed by specific redox-sensitive proteins responsible for the activation of signal transduction culminating in altered gene expression. The subcellular site in which modifications in the ROS/oxidation state occur can also act as a specific cellular redox network signal. The chemical identity of ROS and their subcellular origin is actually a specific imprint on the transcriptome response. In recent years, a number of transcriptomic studies related to altered ROS metabolism in plant peroxisomes have been carried out. In this study, we conducted a meta-analysis of these transcriptomic findings to identify common transcriptional footprints for plant peroxisomal-dependent signaling at early and later time points. These footprints highlight the regulation of various metabolic pathways and gene families, which are also found in plant responses to several abiotic stresses. Major peroxisomal-dependent genes are associated with protein and endoplasmic reticulum (ER) protection at later stages of stress while, at earlier stages, these genes are related to hormone biosynthesis and signaling regulation. Furthermore, in silico analyses allowed us to assign human orthologs to some of the peroxisomal-dependent proteins, which are mainly associated with different cancer pathologies. Peroxisomal footprints provide a valuable resource for assessing and supporting key peroxisomal functions in cellular metabolism under control and stress conditions across species.
    Funding: Spanish Ministry of Science, Innovation and Universities (MCIU); State Research Agency (AEI); FEDER grant PGC2018-098372-B-I00; MCIU Research Personnel Training (FPI) grant BES-2016-07651

    Evolution of genetic networks for human creativity

    The genetic basis for the emergence of creativity in modern humans remains a mystery despite sequencing the genomes of chimpanzees and Neanderthals, our closest hominid relatives. Data-driven methods allowed us to uncover networks of genes distinguishing the three major systems of modern human personality and adaptability: emotional reactivity, self-control, and self-awareness. Now we have identified which of these genes are present in chimpanzees and Neanderthals. We replicated our findings in separate analyses of three high-coverage genomes of Neanderthals. We found that Neanderthals had nearly the same genes for emotional reactivity as chimpanzees, and they were intermediate between modern humans and chimpanzees in their numbers of genes for both self-control and self-awareness. 95% of the 267 genes we found only in modern humans were not protein-coding, including many long non-coding RNAs in the self-awareness network. These genes may have arisen by positive selection for the characteristics of human well-being and behavioral modernity, including creativity, prosocial behavior, and healthy longevity. The genes that cluster in association with those found only in modern humans are over-expressed in brain regions involved in human self-awareness and creativity, including late-myelinating and phylogenetically recent regions of neocortex for autobiographical memory in frontal, parietal, and temporal regions, as well as related components of cortico-thalamo-ponto-cerebellar-cortical and cortico-striato-cortical loops. We conclude that modern humans have more than 200 unique non-protein-coding genes regulating co-expression of many more protein-coding genes in coordinated networks that underlie their capacities for self-awareness, creativity, prosocial behavior, and healthy longevity, which are not found in chimpanzees or Neanderthals.

    Analyzing gender disparities in STEAM: A Case Study from Bioinformatics Workshops in the University of Granada

    Bioinformatics is an interdisciplinary area that has attracted great interest from both academia and corporations in recent years. This growing area combines knowledge and skills from Bio and Science, Technology, Engineering, Arts and Mathematics (STEAM) areas. One of the advantages of the synergy between these two work areas is that it offers an opportunity for closing the traditional STEM gender gap. Despite this opportunity and the significance and wide application of the bioinformatics field, this topic has still not gained enough visibility in the graduate programs for the Bio Bachelor Degrees at the University of Granada. This has motivated the organization of an annual "Educational Workshop on Bioinformatics" at the University of Granada by the Department of Computer Science and Artificial Intelligence. Results of the analysis of the first two editions of this workshop show a great interest in the topic by the university community at all levels (e.g. undergraduate and graduate students, teachers and researchers) without significant distinction among genders at the global level. When analyzing the student group, women did show a higher interest in the subject. However, this interest was not reflected in the higher university strata (teachers and researchers), which represents a glimpse of the current general Spanish situation in the area.
    Funding: Universidad de Granada: Departamento de Arquitectura y Tecnología de Computadore

    Identification of novel prostate cancer genes in patients stratified by Gleason classification: Role of antitumoral genes

    Prostate cancer (PCa) is a tumor with great heterogeneity, both at the molecular and the clinical level. Despite its good overall prognosis, cases can vary from indolent to lethal metastatic, and scientific efforts are aimed at discerning those with worse outcomes. Current prognostic markers, such as the Gleason score, fall short when it comes to distinguishing these cases. Identification of new early biomarkers to enable a better PCa distinction and classification remains a challenge. In order to identify new genes implicated in PCa progression, we conducted several differential gene expression analyses over paired samples comparing primary PCa tissue against healthy prostatic tissue of PCa patients. The results obtained show that this approach is a serious alternative to overcome patient heterogeneity. We were able to identify 250 genes whose expression varies along with tissue differentiation (healthy to tumor tissue); 161 of these genes are described here for the first time as related to PCa. Further manual curation of these genes allowed us to annotate 39 genes with antitumoral activity, 22 of them described for the first time as related to PCa proliferation and metastasis. These findings could be replicated in different cohorts for most genes. Results obtained considering paired differential expression, functional annotation and replication results point to CGREF1, UNC5A, C16orf74, LGR6, IGSF1, QPRT and CA14 as possible new early markers in PCa. These genes may prevent the progression of the disease and their expression should be studied in patients with different outcomes.
    Funding: Spanish Ministry of Science and Innovation, Grant/Award Number: PRE2019-089807; Spanish Ministry of Science and Technology, Grant/Award Numbers: PI15/00914, RTI2018-098983-B-100; Universidad de Granada/CBUA